II. RELATED APPEALS AND INTERFERENCES

Appellants know of no other appeals or interference proceedings related to the present appeal. 

III. STATUS OF CLAIMS 

Claims 1-26 on appeal have been finally rejected under 35 USC § 102(e) as anticipated by 
U.S. Patent 6,236,395 to Sezan et al. (hereafter "Sezan et al.").

IV. STATUS OF AMENDMENTS 

All amendments have been entered, including the Amendment After Final Rejection filed 
March 18, 2004. 

V. CLAIMS ON APPEAL 

A clean copy of claims 1-26 on appeal is attached hereto as Exhibit A. 

VI. SUMMARY OF THE INVENTION 

The present invention generally relates to a method of describing the features of compressed 
or uncompressed audio data and a method of constructing the feature description collection of 
compressed or uncompressed audio video data (specification, page 1, lines 6-9). 

Claims 1-18 on appeal are directed to audio features which are hierarchically represented by
setting an audio program, meaning the entire audio data constituting one audio program, as the highest
hierarchy and describing the audio features in order from higher to lower hierarchies, said
hierarchies being represented by at least one audio program having a semantically continuous content
and at least one of an audio scene and an audio shot, and said hierarchies being described by at least
names of the hierarchies, audio data types, feature types and feature values described by audio
segment information classified according to the feature types (Figs. 1-3; specification, page 9, line
24 to page 11, line 22).

According to these features, compressed or uncompressed audio data can be described
hierarchically by using the novel method of the present invention. In addition, it is possible to provide
a compressed or uncompressed audio feature description that enables high-speed, efficient searching
or inspection of audio data (specification, page 18, line 27 to page 19, line 11).

Claims 19, 21, 23 and 24 on appeal are directed to a compressed or uncompressed audio
video feature description collection construction method, wherein feature descriptions based on
multiple feature types are associated with each audio video program; the feature descriptions are
extracted from multiple audio video programs based on a specific feature type; a feature description
collection is constructed by using multiple extracted feature descriptions; and the feature description
collection is described as a feature description collection file (Fig. 13; specification, page 19, line
26 to page 20, line 19).

Claims 20, 22 and 25-26 on appeal are directed to an embodiment of the present invention
in which the feature type is a summary type; summary descriptions associated with the individual
audio video programs are extracted from multiple audio video programs based on a specific
summary type; a summary collection is constructed using multiple extracted summary descriptions;
and the summary collection is described as a summary collection file (Fig. 14; specification, page
20, line 20 to page 21, line 13).

VII. THE ISSUE 

1. The sole issue on appeal is whether the invention, as recited in Appellants' claims 1-26
on appeal, is anticipated by Sezan et al. under 35 USC § 102(e).

VIII. GROUPING OF THE CLAIMS

Rejected claims 1-18 on appeal should rise or fall together because they are all directed to 
a compressed or uncompressed audio data feature description scheme. 

Rejected claims 19-26 on appeal should rise or fall together because they are all directed to 
a compressed or uncompressed audio video data feature collection description scheme. 

IX. ARGUMENT WITH RESPECT TO THE ISSUES 
A. THE REFERENCES 

The sole prior art reference applied by the Examiner to reject the claims is Sezan et al. 

Sezan et al. discloses an audiovisual information management system including at least one 
description scheme. For audio and/or video programs a program description scheme provides 
information regarding the associated program. For the user a user description scheme provides
information regarding the user's preferences. For the system a system description scheme provides
information regarding the system. The description schemes are independent of one another. 
Preferably, the program description scheme, user description scheme, and system description scheme 
are independent of one another. 

B. SUMMARY OF EXAMINER'S REJECTIONS 

In the Office Action of November 11, 2003, the Examiner rejected claims 1-26 on appeal 
under 35 USC § 102(e) as anticipated by Sezan et al. 

As to claim 1 on appeal, the Examiner urges that Sezan et al. teaches an audio data feature 
description method, comprising the step of: 

hierarchically representing audio features where the (audio or video) program is at the
highest hierarchy, segmenting the program into hierarchies and representing each segment with
segment descriptors/features (Fig. 3; Figs. 13, 16-21; Col. 14, line 45 to Col. 26, line 28; Col. 27, lines
12-43).

As to claim 2 on appeal, the Examiner urges that Sezan et al. teaches semantically 
representing scenes or shots of audio programs (Fig. 13). 

As to claims 3-6 on appeal, the Examiner urges that Sezan et al. teaches where the 
descriptors reflect content and value of the audio data and where the segments are described with key 
frames and time codes (Col. 4, lines 59-65; Figs. 3-12).

As to claim 7 on appeal, the Examiner urges that Sezan et al. teaches an audio/video program
description, wherein the feature values are represented by an audio thumbnail indicating audio pieces 
or images, and where the thumbnail is described according to the feature type and where the 
audio/video program is described in segments (Figs. 4-5). 

As to claim 8 on appeal, the Examiner urges that Sezan et al. teaches where feature values 
are represented by a clip having arbitrary length (Figs. 10-11). 

As to claim 9 on appeal, the Examiner urges that Sezan et al. teaches where a clip
representing audio shots or scenes is represented as key audio clip/key-frame (Fig. 14). 

As to claims 10-12 on appeal, the Examiner urges that Sezan et al. teaches where clips are 
represented by a plurality of object descriptions (Figs. 13, 15 and 20). 

As to claim 13 on appeal, the Examiner urges that Sezan et al. teaches an audio data program
description, where the data consists of multiple channels represented as key streams and where an audio
segment corresponding to the key stream is described (Figs. 4-12).

As to claim 14 on appeal, the Examiner urges that Sezan et al. teaches where audio segments 
are described in events (Fig. 3; Fig. 11; Fig. 13, 480). 

As to claim 15 on appeal, the Examiner urges that Sezan et al. teaches where the program 
description comprises object description scheme (Fig. 13, 482). 






As to claims 16 and 17 on appeal, the Examiner urges that Sezan et al. teaches an audio data
description method, where a representative audio shot or scene is represented as a sequence of slides
(Fig. 8).

As to claim 18 on appeal, the Examiner urges that Sezan et al. teaches an audio data feature
description method where features of an audio program are segmented and described and a value is
produced indicating the level of the feature (Figs. 3 and 13; Fig. 14, 426).

As to claim 19 on appeal, the Examiner urges that Sezan et al. teaches an audio video data 
description method, wherein 

feature type descriptions are extracted and associated with programs; and

feature descriptions are extracted from one or more audio video programs and organized into 
meta description data (Figs. 3, 6 and 16). 

As to claims 20-22 on appeal, the Examiner urges that Sezan et al. teaches where the feature 
types include a summary type (Col. 34, lines 36-42) and where multi-level summary collections are 
generated and feature identifiers are included (Col. 8, lines 33-48; Fig. 6).

As to claims 23-26 on appeal, the Examiner urges that Sezan et al. teaches where the feature
description structures are generated according to contents and summary types (Fig. 13, 402, 404;
Fig. 3, 64).

C. APPELLANTS' ARGUMENT

Claims 1-26 on appeal are patentable over Sezan et al. under 35 USC § 102(e).

For claims to be rejected under 35 USC § 102, all elements recited in the claims must be
disclosed in a single applied reference.

The Examiner has urged that column 14, line 45 - column 26, line 28 and column 27, lines 
12-43 describe an audio data feature description method, wherein audio features are hierarchically 
represented by setting an audio program which means entire audio data constructing one audio 
program at the highest hierarchy and describing the audio features in order from higher to lower 
hierarchies. 

Appellants respectfully disagree. Neither these passages nor any of the drawings show
such a hierarchical representation of audio features.

The "program description scheme" in Sezan et al. only means a type of description scheme. 
Sezan et al. does not teach an audio data feature description method, wherein audio features are 
hierarchically represented by setting an audio program, which means entire audio data constructing 
one audio program at the highest hierarchy and describing the audio features in order from higher 
to lower hierarchies. 

The Examiner has referred to various Figures for showing the elements recited in the other
independent claims 7-8 and 13-19 on appeal. Appellants respectfully disagree.

1. Contrary to the Examiner's assertion, Figs. 4-5 fail to relate to audio thumbnails, as
recited in claim 7.

Figs. 4 and 5 in Sezan et al. teach only an interface for selecting programs. The "Thumbnail 
View" in column 15, line 39 of Sezan et al. appears to relate to a description of thumbnails, but 
Sezan et al. fails to teach "audio pieces" and "describing audio segment information of audio pieces 
as feature type", as recited in claim 7 on appeal. 

2. Contrary to the Examiner's assertion, Figs. 10-11 do not relate to the relationship 
between audio scenes, audio pieces or audio shots, as recited in claim 9 on appeal. 

Figs. 10 and 11 in Sezan et al. only teach an interface for reading programs. Sezan et al.
fails to teach "audio shot" and "feature value of the audio shot are represented by an audio piece".
It appears that "Highlight View" and "Event View" in column 16 correspond to those of Figs. 10 and
11, but the "Highlight View" and "Event View" comprise an identifier of start-frame, end-frame, and
display-frame. They are representative sections extracted from a program. They differ from the
present invention, which represents feature values of one audio scene or one audio shot by an audio clip.

3. Contrary to the Examiner's assertion, Figs. 4-12 do not relate to audio data consisting
of multiple channels or tracks, as recited in claim 13 on appeal.

The Examiner appears to misunderstand what is meant by "channel". The "channel" in
Sezan et al. means TV channel which includes multiple contents. It is clear that a "key stream" of
the present invention is not taught in Sezan et al. A "channel" of the present invention means
multiple audio data such as multiple languages, deputy sound, etc. included in a single content.

4. Contrary to the Examiner's assertion, Figs. 3, 11 and 13 do not disclose a key event;
that the content of the key event is described by text information; or that at least one 
audio segment corresponding to the key event is described, as recited in claim 14 on 
appeal. 

Although claim 14 on appeal appears to relate to "Event Profile" in column 16 of Sezan
et al., the "Event Profile" is different from the present invention. The "Event Profile" describes an
event on video with a "duration" consisting of two values, start-frame-id and end-frame-id, and adds
text and audio as attached information. However, the present invention relates to an event on
the audio itself. This is not taught by Sezan et al. Moreover, Sezan et al. does not teach that the
key event is described as audio duration.

5. Contrary to the Examiner's assertion, Fig. 13 does not disclose a key object; that the
content of the key object is declared and described by text information; or that at
least one audio segment corresponding to the key object is described, as recited in
claim 15 on appeal.

Although it appears that claim 15 on appeal relates to "Object Profile" in column 20 of Sezan
et al., the "Object Profile" is different from the present invention. The "Object Profile" describes an
object on video with a "duration" consisting of two values, start-frame-id and end-frame-id, and
adds text and audio as attached information. However, the present invention relates to an object
on the audio itself. This is not taught by Sezan et al. Moreover, Sezan et al. fails to teach that the
key object is described as audio duration.

6. Contrary to the Examiner's assertion, Fig. 8 does not refer to audio shots or audio 
slides, as recited in claims 16-17 on appeal, respectively. 

Fig. 8 only teaches an interface for reading programs. It appears that Fig. 18 corresponds to
"Shot View" in column 15 and that "Slide View" is similar to the present invention. "Shot View"
comprises start-frame-id, end-frame-id and display-frame-id, and "Slide View" comprises a line of
frames. Neither represents audio duration. Sezan et al. does not teach the limitations of claims 16
and 17 on appeal, namely, that an audio program is represented as an audio slide composed of audio
segments or audio files, the audio slide is declared and described as a feature type, and its feature value
is described with audio segments or audio files.

7. Contrary to the Examiner's assertion, Figs. 3, 13 and 14 do not show that audio data
for multiple feature types are described hierarchically according to the level values,
as recited in claim 18 on appeal.

Reference numeral 426 in Fig. 14 does not teach hierarchical description depending on level values.
Reference numeral 426 seems to relate to "Key Frame View", including a "Clip" consisting of three values:
start-frame-id, end-frame-id and display-frame-id. This is different from claim 18 on appeal,
which recites a description of audio feature type and hierarchical description of audio segment based
on it.

8. The Examiner has urged that Figs. 3, 6 and 15 show feature descriptions extracted
from one or more audio video programs and organized into meta description data.
This does not relate to feature descriptions extracted from multiple audio video
programs based on a specific feature type, and constructing a feature description
collection by using multiple extracted feature descriptions, as recited in claim 19 on
appeal.

Figs. 3, 6 and 15 do not teach claim 19 on appeal of the present invention. Namely, Sezan 
et al. selects and reads multiple programs by using metadata description, but it does not teach that 
new description groups are generated from the metadata description. However, claim 19 on appeal 
relates to generating feature description groups based on a specific feature type from metadata of 
each of multiple programs. 

Despite these arguments, in the Office Action of November 12, 2003, the Examiner urged,
among other things, that Sezan et al. still teaches the hierarchical representation of audio features
where the entire audio data corresponding to one audio program is set at the highest hierarchy and
the audio features are described in order from higher to lower hierarchies.

Appellants respectfully disagree. None of the passages in Sezan et al. cited by the Examiner
pertains to hierarchies. Webster's New World Dictionary, Third College Edition, defines "hierarchy"
as "a group of persons or things arranged in order of rank, grade, class, etc." The Examiner has
stated:

According to Sezan the description scheme 406 provides for several
different presentations (hierarchies) of video content (or audio), such
as for example, a thumbnail view description scheme 410, a key
frame view description scheme 412, a highlight view description
scheme 414, an event view description scheme 416, a close-up view
description scheme 418, and an alternative view description scheme
420 Col. 26, lines 40-50.

Thus, the Examiner has erroneously equated the various types of presentations in Sezan et al. 
to the hierarchical (by rank) representations of audio features recited in claim 1. Sezan et al. fails 
to teach, mention or suggest hierarchically ranking the various types of presentations, as in the 
present invention. 

In particular, the Examiner has stated that the program description scheme in Sezan et al.
comprises arranging a program into sections and/or categories (hierarchies). Appellants respectfully
disagree. The "sections" do not refer to divided parts of a program, but rather to divided parts of the
"program description scheme." More specifically, the "sections" are not elements of the program, but
rather elements of the "program description scheme." Sezan et al. discloses that "the first
section identifies the described program. The second section defines a number of views which may
be useful in browsing applications. The third section defines a number of profiles which may be
useful in filtering and search application." It is apparent that these sections are not in a hierarchical
relation, but have equal status, as shown in the chart below.

program description scheme

    - section 1 (identification)
    - section 2 (views)
    - section 3 (profile)

Regarding the high level and low level features and/or descriptors in Sezan et al. (col. 13, 
lines 33-36), Sezan et al. teaches that the former is easily readable by humans while the latter is 
more easily read by machines and less understandable by humans (col. 13, lines 38-40). So the high 
level and low level features and/or descriptors in Sezan et al. do not refer to hierarchies. 

The Examiner has stated, referring to Fig. 4 of Sezan et al., that "selecting a particular
category, such as news, provides a set of thumbnail views of different programs that are currently
available for viewing. In addition, the different programs may also include programs that will be
available at a different time for viewing." Sezan et al. teaches a set of thumbnail views for video, but
not for audio. In contrast, the present invention is directed to audio. Because a set of thumbnails
defined by an audio segment is not visible, a program selection interface, as shown in Fig. 4 of Sezan
et al., could not be used for audio. Thus, the present invention differs from Sezan et al. A set of
audio thumbnails in the present invention is used for searching audio to which a user desires to
listen. In other words, the user can easily find a desired audio by playing back the set of audio
thumbnails, which cannot be accomplished by Sezan et al., which is directed to video.

The Examiner has stated that several different presentations are hierarchies. Appellants
respectfully disagree. Referring to Fig. 14, Sezan et al. discloses that the several different presentations
describe a number of views (col. 26, lines 39-51). There are no hierarchies among them, but instead
only equal relations.

Claims 1-8 and 16-17 on appeal recite this hierarchical representation of audio features. 

The Examiner has urged that Figs. 3, 6 and 15 show feature descriptions extracted from one
or more audio video programs and organized into meta description data. This does not relate to
feature descriptions extracted from multiple audio video programs based on a specific feature type,
and constructing a feature description collection by using multiple extracted feature descriptions, as
recited in claim 19 on appeal.

Thus, Figs. 3, 6 and 15 do not teach claim 19 on appeal. Namely, Sezan et al. selects and 
reads multiple programs by using metadata description, but it does not teach that new description 
groups are generated from the metadata description. However, claim 19 on appeal relates to 
generating feature description groups based on a specific feature type from metadata of each of 
multiple programs. 

Sezan et al. teaches that, in the case of a plurality of audio video programs, different feature 
types (A, B, C and D) are extracted from each of the audio video programs (1-3), and then are 
separately combined to make systematic feature collection descriptions for each of them (see the top 
drawing below). 

In contrast, claim 19 on appeal recites that a specific feature type (A) is extracted from
each of a plurality of audio video programs (1-3), and then the specific feature types (A) are
combined to make a systematic feature collection description for them (see the bottom drawing below).

The distinctions between Sezan et al. and claim 19 on appeal are shown below:

Sezan et al.

audio video programs      feature collection descriptions

program 1                 Feature Type A, Feature Type B, Feature Type C, Feature Type D
program 2                 Feature Type A, Feature Type B, Feature Type C, Feature Type D
program 3                 Feature Type A, Feature Type B, Feature Type C, Feature Type D

Claim 19 on appeal

audio video programs      a feature collection description of the feature type A

program 1                 Feature Type A-1
program 2                 Feature Type A-2
program 3                 Feature Type A-3

X. CONCLUSION



For the above reasons, the Board of Patent Appeals and Interferences is
respectfully requested to reverse the Examiner's rejections of claims 1-26 on appeal and pass this
application to issue.

In the event this paper is not timely filed, Appellants hereby petition for an appropriate extension
of time. The fee for any such extension may be charged to Deposit Account No. 01-2340, along with
any other additional fees which may be required with respect to this paper.

Respectfully submitted, 
ARMSTRONG, KRATZ, QUINTOS, HANSON & BROOKS, LLP 

William L. Brooks 
Attorney for Appellants 
Registration No. 34,129

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES 

In re the Application of: SUGANO, Masaru et al. 

Group Art Unit: 2655 

Serial No.: 09/730,607 

Examiner: D. D. ABEBE 

Filed: December 7, 2000 

P.T.O. Confirmation No.: 9246 

For: AUDIO FEATURES DESCRIPTION METHOD AND AUDIO VIDEO FEATURES 
DESCRIPTION COLLECTION CONSTRUCTION METHOD 

CLAIMS ON APPEAL

Commissioner for Patents 
P.O. Box 1450 

Alexandria, VA 22313-1450

June 25, 2004

Sir: 

The claims on appeal are 1-26, presented below. 

Claim 1 (previously presented): A compressed or uncompressed audio data feature 
description scheme, 

wherein audio features are hierarchically represented by setting entire audio data which 
corresponds to one audio program at the highest hierarchy and describing the audio features in order 
from higher to lower hierarchies. 



Claim 2 (previously presented): A compressed or uncompressed audio data feature 
description scheme according to claim 1, wherein

said hierarchies are represented by one or more audio programs having a semantically 
continuous content and at least either an audio scene or an audio shot.

Claim 3 (previously presented): A compressed or uncompressed audio data feature
description scheme according to claim 1, wherein 

said hierarchy is described by at least a hierarchy identifier and a feature which includes an 
audio type, a feature type and audio segment information classified according to the feature types. 

Claim 4 (previously presented): A compressed or uncompressed audio data feature 
description scheme according to claim 2, wherein 

said hierarchy is described by at least a hierarchy identifier and a feature which includes an 
audio data type, a feature type and audio segment information classified according to the feature 
types. 

Claim 5 (previously presented): A compressed or uncompressed audio feature description 
scheme according to claim 3, wherein 

said audio segment information is described by any of a combination of start time code and 
end time code, a combination of start time code and duration, a combination of a start frame/sample 
number and an end frame/sample number, or a combination of start frame/sample number and 
number of frames/samples corresponding to duration. 

Claim 6 (previously presented): A compressed or uncompressed audio data feature 
description scheme according to claim 4, wherein 

said audio segment information is described by any of a combination of start time code and 
end time code, a combination of start time code and duration, a combination of start frame/sample
number and an end frame/sample number, or a combination of start frame/sample number and
number of frames/samples corresponding to duration. 

Claim 7 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein 

an audio program is described through one or more hierarchies; 

an audio feature of each hierarchy is represented by an audio thumbnail indicating either one 
or more audio pieces or images; 

the audio thumbnail is declared and described as a feature type; 

if the audio thumbnail is the audio pieces, segment information of one or more audio pieces 
are described; and 

if the audio thumbnail is the images, one or more file names of the images are described. 

Claim 8 (previously presented): A compressed or uncompressed audio data feature
description scheme, wherein 

an audio feature of at least one audio scene or one audio shot is represented by an audio clip 
which is at least one audio piece having an arbitrary length equal to or shorter than that of the audio 
scene or the audio shot, respectively; 

said audio scenes and/or audio shots are described through one or more hierarchies. 

Claim 9 (previously presented): A compressed or uncompressed audio data feature 
description scheme according to claim 8, wherein

the key audio clip is declared and described as a feature type;

if an audio type of the key audio clips is sound, a sound representing the key audio clips is 
represented as the key sound; 

the key sound is declared and described as a feature sub type; and 

at least one audio segment corresponding to the key sound is described. 

Claim 13 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein 

if audio data consists of multiple channels or tracks, a representative channel or track of the 
audio data is represented as the key stream; 

the key stream is declared and described as a feature type; and 

at least one audio segment corresponding to the key stream is described. 

Claim 14 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein 

an audio clip representing an event in audio data is represented as the key event; 

the key event is declared and described as a feature type; 

a content of the key event is described by textual information; and 

at least one audio segment corresponding to the key event is described. 

Claim 15 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein

an audio clip from a representative audio source in audio data is represented as the key object;

the key object is declared and described as a feature type; 

a content of the key object is declared and described by textual information; and 

at least one audio segment corresponding to the key object is described. 

Claim 16 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein 

an audio program is described through one or more hierarchies; 

at least one introduction or representative audio piece of each hierarchy corresponding to an 
audio program, an audio scene or an audio shot is represented as an audio segment; 
a sequence of the audio segments is represented as an audio slide; 
the audio slide is declared and described as a feature type; and 
the audio segments composing the audio slide are described. 

Claim 17 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein 

an audio program is described through one or more hierarchies; 

at least one introduction or representative audio piece of each hierarchy corresponding to an 
audio program, an audio scene or an audio shot is saved as an audio file; 
a sequence of the audio files is represented as an audio slide; 
the audio slide is declared and described as a feature type; and
file names of the audio files composing the audio slide are described.

Claim 18 (previously presented): A compressed or uncompressed audio data feature 
description scheme, wherein 

if a feature type is any of a shot, a key audio clip, a key word, a key note, or a key sound, 
value indicating level of the feature types is described; and 

multiple audio data with said feature types are described hierarchically according to the level
values.

Claim 19 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme, wherein 

feature descriptions based on various feature types are associated with each audio video 
program; 

the feature descriptions are extracted from multiple audio video programs based on a specific 
feature type; 

a feature collection description is constructed by using multiple extracted feature 
descriptions; and 

the feature collection description is described as a feature collection description file. 

Claim 20 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme according to claim 19, wherein 
the feature type is a summary type;
summary descriptions associated with each audio video programs are extracted from multiple
audio video programs based on a specific summary type; 

a summary collection is aggregated using multiple extracted summary descriptions; and 
the summary collection is described as a summary collection description file. 

Claim 21 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme according to claim 19, wherein 

as an element for describing the feature collection description in the feature collection 
description file, the feature types for feature collection descriptions and contents of the feature types 
are described at a higher level; and 

the audio video program identifiers referred to by each feature description and each 
corresponding segment information in the audio video programs are described. 

Claim 22 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme according to claim 21, wherein 

if the feature is a summary of audio video data, summary types for summary collection and 
contents of the summary types are described at a higher level as an element for describing the 
summary collection in the summary collection file; 

the audio video program identifiers referred to by each summary description and each 
corresponding segment information in the audio video programs are described at a lower level.

Claim 23 (previously presented): A compressed or uncompressed audio video data feature
collection description scheme according to claim 19, wherein 

the feature types for feature collection descriptions and contents of the feature types are 
described altogether in a nested structure, whereby the feature collection can be described based on 
different feature types, or based on different contents among the same feature type. 

Claim 24 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme according to claim 21, wherein 

the feature types for feature collection descriptions and contents of the feature types are 
described altogether in a nested structure, whereby the feature collection can be described based on 
different feature types, or based on different contents among the same feature type. 

Claim 25 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme according to claim 20, wherein 

the summary types for summary collection descriptions and contents of the summary types 
are described altogether in a nested structure, whereby the summary collection can described based 
on different summary types, or based on different contents among the same summary type. 

Claim 26 (previously presented): A compressed or uncompressed audio video data feature 
collection description scheme according to claim 22, wherein

the summary types for summary collection descriptions and contents of the summary types
are described altogether in a nested structure, whereby the summary collection can described based 
on different summary types or based on different contents among the same summary type.